The CAP theorem states that a distributed data store can provide at most two of three guarantees simultaneously: Consistency (all nodes see the same data), Availability (every request receives a response), and Partition Tolerance (the system continues operating despite network failures).
The CAP theorem, formulated by Eric Brewer in 2000, is a fundamental principle of distributed systems. It states that in the presence of a network partition (P), a distributed system must choose between Consistency (C) and Availability (A). While often misunderstood as a requirement to pick one of the three permanently, the theorem actually applies only during network failures. Systems can provide all three in normal operation, but when communication between nodes breaks, they must sacrifice either consistency or availability.
Consistency (C): Every read receives the most recent write or an error. All nodes see the same data at the same time. In a consistent system, after a write, any subsequent read will see that write (or be blocked until it does).
Availability (A): Every request receives a (non-error) response, without the guarantee that it contains the most recent write. The system remains operational even if some nodes are down.
Partition Tolerance (P): The system continues to operate despite an arbitrary number of messages being dropped or delayed between nodes. In practice, network partitions are unavoidable in distributed systems.
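The trade-off these definitions force can be shown with a toy two-replica key-value store (not a real database; all names here are illustrative). During a partition, a CP store refuses requests rather than risk stale answers, while an AP store answers from its local, possibly stale copy:

```python
class Replica:
    def __init__(self):
        self.data = {}

class ReplicatedStore:
    """Toy model: a primary and a secondary replica, with a flag that
    simulates a network partition between them."""

    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.primary = Replica()
        self.secondary = Replica()
        self.partitioned = False

    def write(self, key, value):
        if self.partitioned and self.mode == "CP":
            # Consistency chosen: reject rather than let replicas diverge.
            raise RuntimeError("write rejected: replica unreachable")
        self.primary.data[key] = value
        if not self.partitioned:
            self.secondary.data[key] = value
        # AP during a partition: the write is accepted on the primary
        # only, to be replicated after the partition heals.

    def read(self, key):
        # Reads are served by the secondary in this toy model.
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("read rejected: cannot verify freshness")
        return self.secondary.data.get(key)
```

With `mode="AP"`, a read during a partition succeeds but can return a value written before the partition, which is exactly the availability-over-consistency choice described above.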
System: HBase, and traditional relational databases with synchronous replication (e.g., PostgreSQL)
Example: A banking system where you transfer money between accounts. During a network partition, the system will choose consistency over availability—it will reject requests rather than risk showing incorrect balances. If the primary node cannot communicate with its replica, writes fail or the system becomes read-only until the partition heals.
Trade-off: Users may experience errors or timeouts during network issues, but data integrity is preserved.
Real-World: In a stock trading system, you'd rather have a failed transaction than a transaction that double-sells stock due to inconsistency.
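The banking scenario above can be sketched as follows. This is a hypothetical illustration, not a real banking implementation: the transfer commits only if the synchronous replica acknowledges it; otherwise the whole operation is rolled back and the caller sees an error.

```python
class ReplicaDown(Exception):
    """Raised when the synchronous replica cannot be reached."""

class CPBank:
    def __init__(self, balances):
        self.balances = dict(balances)   # primary copy
        self.replica = dict(balances)    # synchronous replica
        self.replica_reachable = True    # flip to False to simulate a partition

    def _replicate(self):
        if not self.replica_reachable:
            raise ReplicaDown("cannot reach replica")
        self.replica = dict(self.balances)

    def transfer(self, src, dst, amount):
        if self.balances[src] < amount:
            raise ValueError("insufficient funds")
        snapshot = dict(self.balances)
        self.balances[src] -= amount
        self.balances[dst] += amount
        try:
            self._replicate()
        except ReplicaDown:
            # Consistency over availability: abort, leaving balances intact.
            self.balances = snapshot
            raise
```

During a partition the transfer fails loudly, but no balance is ever left half-updated, which is the behavior a CP system promises.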
System: Cassandra, CouchDB, Amazon DynamoDB (with eventual consistency), Riak
Example: A social media feed where users post updates and view content. During a network partition, the system continues to accept writes and reads from any available node, even if they are temporarily inconsistent. A user in the partitioned region might see old content or not see recent posts, but the system remains operational.
Trade-off: Users always get a response, but they may see stale data. Consistency is eventually restored when the partition heals.
Real-World: A Netflix recommendation engine. If a network partition occurs, users still get recommendations (maybe not the absolute latest viewing history) rather than seeing an error page.
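The social-feed behavior above can be sketched with two toy replicas that keep accepting posts while partitioned from each other and merge their histories once the partition heals (the merge-by-timestamp scheme is an illustrative simplification, not how Cassandra or DynamoDB actually reconcile):

```python
class FeedNode:
    """One replica of a social feed. Posts are (timestamp, text) pairs."""

    def __init__(self):
        self.posts = []

    def post(self, ts, text):
        # Always accepted locally, even during a partition.
        self.posts.append((ts, text))

    def feed(self):
        return [text for ts, text in sorted(self.posts)]

def reconcile(a, b):
    # After the partition heals: union both histories, ordered by timestamp.
    merged = sorted(set(a.posts) | set(b.posts))
    a.posts = list(merged)
    b.posts = list(merged)
```

While partitioned, each node serves its own view (available, possibly missing the other side's posts); after `reconcile`, both converge on the same feed, which is the "eventual" in eventual consistency.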
System: Traditional single-node databases (SQLite, single-instance PostgreSQL), or systems that assume no network partitions (rare in distributed systems)
Example: A local application using SQLite. Since there is only one node, there are no network partitions to tolerate. All reads and writes are consistent and available.
Trade-off: This is not possible in a truly distributed system across a network. CA systems either run on a single node or assume network partitions never happen—an assumption that fails in any geographically distributed deployment.
Real-World: A point-of-sale terminal that uses a local database and syncs with a central server later. During network issues, the terminal operates on its local data (CA within the terminal), but overall system consistency is eventually achieved.
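The point-of-sale pattern above can be sketched with Python's built-in sqlite3 module. The table layout and the injected `upload` callback are illustrative assumptions: every sale commits locally right away (CA within the terminal), and a sync step pushes unsynced rows to the central server once the network is back.

```python
import sqlite3

def open_terminal_db():
    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE sales (
                      id INTEGER PRIMARY KEY,
                      item TEXT NOT NULL,
                      amount REAL NOT NULL,
                      synced INTEGER NOT NULL DEFAULT 0)""")
    return db

def record_sale(db, item, amount):
    # Always succeeds against the local database, network or not.
    db.execute("INSERT INTO sales (item, amount) VALUES (?, ?)", (item, amount))
    db.commit()

def sync(db, upload):
    # Push unsynced rows; mark each one only after the server accepts it,
    # so a failed upload leaves the row queued for the next attempt.
    rows = db.execute(
        "SELECT id, item, amount FROM sales WHERE synced = 0").fetchall()
    for row_id, item, amount in rows:
        upload(item, amount)   # may raise if the terminal is still offline
        db.execute("UPDATE sales SET synced = 1 WHERE id = ?", (row_id,))
    db.commit()
```

The `synced` flag is the simplest possible outbox: the terminal is its own source of truth until the central server catches up.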
Extension: PACELC (If Partition, choose between Availability and Consistency; Else, choose between Latency and Consistency) adds the latency dimension to CAP.
Example: DynamoDB offers "strongly consistent reads" (CP-like) at higher latency, or "eventually consistent reads" (AP-like) with lower latency. Even without a partition, you choose between consistency and latency.
Real-World: A content delivery network. During normal operation, you might accept eventual consistency for lower latency (edge caches may be stale). During a partition, you choose availability (serve stale content from edge) over consistency.
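The latency/consistency half of PACELC can be sketched with a toy three-replica store. Latencies here are simulated counters, not real network timings, and the quorum scheme is a simplified illustration rather than DynamoDB's actual mechanics: a strongly consistent read contacts a majority of replicas (paying more round trips), while an eventually consistent read asks one replica and may miss a recent write.

```python
class PacelcStore:
    def __init__(self):
        self.replicas = [{}, {}, {}]
        self.lag = 0.005   # simulated per-replica round-trip cost (seconds)

    def write(self, key, value):
        # Writes land on replica 0 first; others replicate asynchronously.
        self.replicas[0][key] = value

    def replicate(self):
        for r in self.replicas[1:]:
            r.update(self.replicas[0])

    def eventually_consistent_read(self, key):
        # One replica, one round trip: low latency, possibly stale.
        return self.replicas[-1].get(key), self.lag

    def strongly_consistent_read(self, key):
        # Contact a majority (2 of 3). Higher latency, but never stale in
        # this toy model because replica 0 is always part of the quorum.
        quorum = self.replicas[:2]
        return quorum[0].get(key), self.lag * len(quorum)
```

Even with no partition anywhere, the caller picks between the cheap read and the fresh read on every request, which is exactly the "Else, choose between Latency and Consistency" clause.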